Bag-of-concepts: Comprehending document representation through clustering words in distributed representation

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Locally Weighted Bag of Words Framework for Document Representation

The popular bag of words assumption represents a document as a histogram of word occurrences. While computationally efficient, such a representation is unable to maintain any sequential information. We present an effective sequential document representation that goes beyond the bag of words representation and its n-gram extensions. This representation uses local smoothing to embed documents as ...

متن کامل

Bag-of-Concepts Document Representation for Textual News Classification

Automatic classification of news articles is a relevant problem due to the large amount of news generated every day, so it is crucial that these news are classified to allow for users to access to information of interest quickly and effectively. Traditional classification systems represent documents as bag-of-words (BoW), which are oblivious to two problems of language: synonymy and polysemy. T...

متن کامل

Rich Document Representation for Document Clustering

In traditional document clustering models, a document is considered as a bag of words. In this paper we present a new method for generating feature vectors, using the sentence fragments that are called logical terms and statements, in PLIR system. PLIR is a Knowledge-Based Information system based on the theory of the Plausible Reasoning. We have conducted a number of experiments using OHSUMED ...

متن کامل

More than Bag-of-Words: Sentence-based Document Representation for Sentiment Analysis

Most sentiment analysis approaches rely on machine-learning techniques, using a bag-of-words (BoW) document representation as their basis. In this paper, we examine whether a more fine-grained representation of documents as sequences of emotionally-annotated sentences can increase document classification accuracy. Experiments conducted on a sentence and document level annotated corpus show that...

متن کامل

Discriminant Bag of Words based representation for human action recognition

In this paper we propose a novel framework for human action recognition based on Bag of Words (BoWs) action representation, that unifies discriminative codebook generation and discriminant subspace learning. The proposed framework is able to, naturally, incorporate several (linear or non-linear) discrimination criteria for discriminant BoWs-based action representation. An iterative optimization...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Neurocomputing

سال: 2017

ISSN: 0925-2312

DOI: 10.1016/j.neucom.2017.05.046